DECISION ON APPEAL
STATEMENT OF THE CASE 
Janakiraman, Dutta, and Schwerdtfeger (Appellants) appeal under 
35 U.S.C. § 134 from the Examiner's Final Rejection of claims 1, 3 through 
8, 10 through 15, and 17 through 21, which are all of the claims pending in 
this application. 
Appellants' invention relates to an improved system for presenting
multimedia data to users with disabilities. See Specification, page 1. Claim
1 is illustrative of the claimed invention, and it reads as follows:

1. A method for presenting text from moving video to a user, the
method comprising: 

receiving multimedia data containing a plurality of moving video 
frames and an associated plurality of sets of text data, wherein the associated 
plurality of sets of text data are associated in time with the plurality of 
moving video frames, wherein the plurality of sets of text data includes a 
first text data set associated with a first plurality of moving video frames of 
the multimedia data, and a second text data set associated with a second 
plurality of moving video frames of the multimedia data; 

extracting the associated plurality of sets of text data from the 
multimedia data; 

extracting a first video frame, from the first plurality of moving video 
frames, associated with the first text data set to form a first still image; 

extracting a second video frame, from the second plurality of moving 
video frames, associated with the first[1] text data set to form a second still
image; 

outputting the first text data set in association with the first still image;
and

outputting the second text data set in association with the second still 
image. 

The prior art references of record relied upon by the Examiner in 
rejecting the appealed claims are: 



Loui US 6,813,618 B1 Nov. 02, 2004

Bergen US 6,956,573 B1 Oct. 18, 2005

Isabel F. Cruz (Cruz), A User-Centered Interface for Querying Distributed
Multimedia Databases, ACM SIGMOD Record, Vol. 28, Issue 2, 590-93,
(1999).

[1] Since the second plurality of moving video frames is associated with a
second text data set, it appears that this should read "second." We note that
the same inconsistency appears in each of the independent claims.

Claims 1, 3 through 6, 8, 10 through 13, 15, and 17 through 20 stand 
rejected under 35 U.S.C. § 103(a) as being unpatentable over Loui in view 
of Bergen. 

Claims 7, 14, and 21 stand rejected under 35 U.S.C. § 103 as being 
unpatentable over Loui in view of Bergen and Cruz. 

We refer to the Examiner's Answer (mailed October 11, 2006) and to
Appellants' Brief (filed September 12, 2006) for the respective arguments. 

SUMMARY OF DECISION 
As a consequence of our review, we will affirm the obviousness 
rejection of claims 1, 3 through 6, 8, 10 through 13, 15, and 17 through 20, 
but reverse the obviousness rejection of claims 7, 14, and 21. 

OPINION 

Appellants contend (Br. 13-17) that Loui and Bergen fail to suggest 
plural moving video frames with a first set of text data associated in time 
with first moving video frames and a second set of text data associated in 
time with second moving video frames. Appellants further contend that 
Loui and Bergen fail to teach extracting a first video frame associated with 
the first set of text data and a second video frame associated with the second 
text data set to form first and second still images. The main issue, therefore, 
is whether the combination of Loui and Bergen teaches or suggests
extracting a first video frame associated with a first set of text data and a 
second video frame associated with a second set of text data and forming 
two still images therefrom. 

Loui discloses (col. 1, ll. 61-67) that video clips can be placed in a
digital album by selecting a key frame for static display to identify the video. 
Thus, Loui suggests extracting (by selecting) a video frame (a key frame) 
from plural video frames (the video clip) to display a frame representative of 
the video clip. Further, Loui discloses (col. 2, ll. 1-5) that modern cameras
allow associating textual data with digital images. Loui (col. 5, ll. 40-48)
describes display 20 of Figure 3 as showing four digital photographs, each 
with associated text describing the photograph. Thus, Loui suggests 
associating text with the images to describe them. 

Bergen (col. 2, ll. 29-32 and 44-47) describes a database that provides
scene-based video information to a user by dividing a video stream into 
scenes, each made up of plural frames, with a key frame for each scene. 
Bergen further discloses (col. 4, ll. 12-21) providing information associated
with the video to identify portions of one or more frames or scenes. The 
information may be summaries or textual descriptions of the scenes. (See 
col. 10, ll. 31-36.) Thus, Bergen further suggests extracting a key frame
from each of a plurality of video streams and associating text with each 
extracted frame. We note that Appellants argue (Br. 17) that Bergen "does 
not teach or suggest dividing a video stream based on sets of text data 
associated in time with moving video frames." However, independent 
claims 1, 8, and 15 do not require dividing the video stream "based on" text 
data. The claims merely call for extracting two video frames associated with 
first and second sets of text data, respectively, and forming still images
therefrom, which is suggested by both Loui and Bergen. 

Appellants further contend (Br. 17-19) that the Examiner has failed to 
point to a teaching, suggestion, or motivation in the prior art to combine 
and/or modify Loui and Bergen. The Supreme Court recently held that in 
analyzing the obviousness of combining elements, a court need not find 
specific teachings, but rather may consider "the background knowledge 
possessed by a person having ordinary skill in the art" and "the inferences 
and creative steps that a person of ordinary skill in the art would employ." 
See KSR Int'l v. Teleflex Inc., 127 S. Ct. 1727, 1740-41, 82 USPQ2d 1385,
1396 (2007). Since Loui and Bergen describe such similar systems, it would 
have been obvious to the skilled artisan to use steps/elements from one for 
the other. 

Appellants contend (Br. 21-23) that the Examiner used impermissible 
hindsight in combining Loui and Bergen because each presented a complete 
solution to the problem they faced, and, thus, the skilled artisan would not 
have been motivated to combine/modify them. However, Bergen has not 
been used to modify Loui, but rather reinforces the teachings and 
suggestions made by Loui. Accordingly, the Examiner has not used 
impermissible hindsight. Since we have found that Loui and Bergen suggest 
extracting two video frames associated with first and second sets of text 
data, respectively, and forming still images therefrom, we will sustain the 
obviousness rejection of claims 1, 8, and 15, and dependent claims 3 through 
6, 10 through 13, and 17 through 20, which have not been separately
argued.[2]

Appellants (Br. 23-25) contend that Cruz, added to the primary 
combination by the Examiner for rejecting claims 7, 14, and 21, fails to 
suggest discarding remaining moving video frames from the first plurality of 
moving video frames, as recited in each of claims 7, 14, and 21. The second 
issue, therefore, is whether Cruz, in combination with Loui and Bergen, 
suggests discarding the remaining video frames. 

The Examiner relies (Answer 9) on deselecting the "video" checkbox 
in Figure 2 of Cruz as suggesting discarding remaining video. However, the 
checkbox operates to determine whether or not video is to be displayed. 
Cruz does not address whether remaining video should be discarded after a 
single frame has been extracted from the video stream. Since Loui and 
Bergen also fail to address this limitation, we will reverse the obviousness 
rejection of claims 7, 14, and 21. 

ORDER 

The decision of the Examiner rejecting claims 1, 3 through 8, 10
through 15, and 17 through 21 under 35 U.S.C. § 103 is affirmed as to
claims 1, 3 through 6, 8, 10 through 13, 15, and 17 through 20, but reversed
as to claims 7, 14, and 21.

[2] We note that Bergen discloses (col. 20, ll. 14-44) a video book in which a
temporal index of a movie can be presented as a series of frames, wherein
each frame represents a scene from the movie. Each scene has a prewritten
description of the contents which can be requested after the frames are
viewed. Thus, Bergen's video book includes extracting frames from a video
with text data associated in time with the video frames. The main difference
between Bergen and claim 1 appears to be that Bergen displays the text after
the user views the still images, not in association with the still images.

No time period for taking any subsequent action in connection with 
this appeal may be extended under 37 C.F.R. § 1.136(a). See 37 C.F.R.
§ 1.136(a)(1)(iv).



AFFIRMED-IN-PART 



tdl/gw 